
    Learning Hybrid Process Models From Events: Process Discovery Without Faking Confidence

    Process discovery techniques return process models that are either formal (precisely describing the possible behaviors) or informal (merely a "picture" not allowing for any form of formal reasoning). Formal models are able to classify traces (i.e., sequences of events) as fitting or non-fitting. Most process mining approaches described in the literature produce such models. This is in stark contrast with the over 25 available commercial process mining tools that only discover informal process models that remain deliberately vague on the precise set of possible traces. There are two main reasons why vendors resort to such models: scalability and simplicity. In this paper, we propose to combine the best of both worlds: discovering hybrid process models that have formal and informal elements. As a proof of concept, we present a discovery technique based on hybrid Petri nets. These models allow for formal reasoning, but also reveal information that cannot be captured in mainstream formal models. A novel discovery algorithm returning hybrid Petri nets has been implemented in ProM and has been applied to several real-life event logs. The results clearly demonstrate the advantages of remaining "vague" when there is not enough "evidence" in the data or standard modeling constructs do not "fit". Moreover, the approach is scalable enough to be incorporated in industrial-strength process mining tools. Comment: 25 pages, 12 figures.
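
    The idea of staying "vague" where support is weak can be illustrated with a small sketch: count directly-follows observations in the log and treat only well-supported dependencies as sure, keeping weakly supported ones as informal hints. The thresholds, the log format, and the split itself are illustrative assumptions, not the paper's hybrid Petri net algorithm.

```python
# Sketch: splitting discovered dependencies into "sure" and "unsure" edges,
# in the spirit of hybrid models that stay vague where evidence is weak.
# The thresholds and log format are illustrative assumptions.
from collections import Counter

def directly_follows(log):
    """Count how often activity a is directly followed by activity b."""
    df = Counter()
    for trace in log:
        for a, b in zip(trace, trace[1:]):
            df[(a, b)] += 1
    return df

def split_edges(log, sure_threshold=5, unsure_threshold=2):
    """Return (sure, unsure) dependency sets based on observation counts."""
    df = directly_follows(log)
    sure = {edge for edge, n in df.items() if n >= sure_threshold}
    unsure = {edge for edge, n in df.items()
              if unsure_threshold <= n < sure_threshold}
    return sure, unsure

if __name__ == "__main__":
    log = [["a", "b", "d"]] * 6 + [["a", "c", "d"]] * 3 + [["a", "d"]]
    sure, unsure = split_edges(log)
    print("sure:", sorted(sure))      # well-supported dependencies
    print("unsure:", sorted(unsure))  # kept only as informal hints
```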

    Unfolding-Based Process Discovery

    This paper presents a novel technique for process discovery. In contrast to the current trend, which only considers an event log for discovering a process model, we assume two additional inputs: an independence relation on the set of logged activities, and a collection of negative traces. After deriving an intermediate net unfolding from them, we perform a controlled folding giving rise to a Petri net which contains both the input log and all independence-equivalent traces arising from it. Remarkably, the derived Petri net cannot execute any trace from the negative collection. The entire chain of transformations is fully automated. A tool has been developed, and experimental results are provided that witness the significance of the contribution of this paper. Comment: This is the unabridged version of a paper with the same title that appeared in the proceedings of ATVA 201
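
    The role of the independence relation can be made concrete with a minimal sketch that closes a set of traces under swaps of adjacent independent activities; the resulting set corresponds to the independence-equivalent traces mentioned above. The toy log and relation are assumptions, and the paper's construction works on net unfoldings rather than explicit trace enumeration.

```python
# Sketch: closing a set of traces under an independence relation by swapping
# adjacent independent activities. Illustration only; the actual technique
# folds a net unfolding instead of enumerating traces.
def independence_closure(traces, independent):
    """Smallest set containing `traces` and closed under swapping adjacent
    independent activities. `independent` is a set of unordered pairs."""
    seen = set(tuple(t) for t in traces)
    work = list(seen)
    while work:
        t = work.pop()
        for i in range(len(t) - 1):
            a, b = t[i], t[i + 1]
            if frozenset((a, b)) in independent:
                s = t[:i] + (b, a) + t[i + 2:]
                if s not in seen:
                    seen.add(s)
                    work.append(s)
    return seen

if __name__ == "__main__":
    log = [("a", "b", "c", "d")]
    independent = {frozenset(("b", "c"))}   # b and c may be reordered
    for t in sorted(independence_closure(log, independent)):
        print(t)   # ('a','b','c','d') and ('a','c','b','d')
```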

    Finding suitable activity clusters for decomposed process discovery

    Event data can be found in any information system and provide the starting point for a range of process mining techniques. The widespread availability of large amounts of event data also creates new challenges. Existing process mining techniques are often unable to handle "big event data" adequately. Decomposed process mining aims to solve this problem by decomposing the process mining problem into many smaller problems which can be solved in less time, using fewer resources, or even in parallel. Many decomposed process mining techniques have been proposed in the literature. Analysis shows that even though the decomposition step takes a relatively small amount of time, it is of key importance in finding a high-quality process model and for the computation time required to discover the individual parts. Currently, there is no way to assess the quality of a decomposition beforehand. We define three quality notions that can be used to assess a decomposition before using it to discover a model or check conformance. We then propose a decomposition approach that uses these notions and is able to find a high-quality decomposition in little time. Keywords: decomposed process mining, decomposed process discovery, distributed computing, event log
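
    To make "assessing a decomposition beforehand" concrete, the sketch below scores an activity clustering against the log with a simple cohesion/coupling split of the directly-follows counts. This metric is a stand-in assumption, not one of the paper's three quality notions.

```python
# Sketch: scoring an activity clustering against a log before running any
# discovery on the parts. Cohesion/coupling over directly-follows counts is
# an assumed stand-in metric, not the paper's quality notions.
from collections import Counter

def df_counts(log):
    df = Counter()
    for trace in log:
        for a, b in zip(trace, trace[1:]):
            df[(a, b)] += 1
    return df

def score_decomposition(log, clusters):
    """clusters: list of sets of activities. Returns (cohesion, coupling)."""
    df = df_counts(log)
    cluster_of = {a: i for i, c in enumerate(clusters) for a in c}
    inside = sum(n for (a, b), n in df.items()
                 if cluster_of.get(a) == cluster_of.get(b))
    total = sum(df.values())
    cohesion = inside / total if total else 1.0
    coupling = 1.0 - cohesion   # fraction of dependencies crossing clusters
    return cohesion, coupling

if __name__ == "__main__":
    log = [["a", "b", "c", "d"], ["a", "c", "b", "d"]]
    print(score_decomposition(log, [{"a", "b"}, {"c", "d"}]))
```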

    Conformance checking using activity and trace embeddings

    Conformance checking describes process mining techniques used to compare an event log and a corresponding process model. In this paper, we propose an entirely new approach to conformance checking based on neural-network embeddings. These embeddings are vector representations of every activity/task present in the model and log, obtained via act2vec, a Word2vec-based model. Our novel conformance checking approach applies the Word Mover’s Distance to the activity embeddings of traces in order to measure fitness and precision. In addition, we investigate a more efficiently calculated lower bound of the former metric, i.e. the Iterative Constrained Transfers measure. An alternative method using trace2vec, a Doc2vec-based model, to train and compare vector representations of the process instances themselves is also introduced. These methods are tested in different settings and compared to other conformance checking techniques, showing promising results.
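
    An act2vec-style setup can be pieced together with off-the-shelf gensim, assuming gensim (and its optional POT dependency for Word Mover's Distance) is installed: traces play the role of sentences and activities the role of words. The toy traces and parameters below are assumptions; this illustrates the idea rather than reproducing the authors' implementation.

```python
# Sketch: activity embeddings trained with gensim's Word2Vec and compared
# with Word Mover's Distance (requires the optional POT dependency).
# Toy data and parameters are assumptions for illustration only.
from gensim.models import Word2Vec

log_traces = [["register", "check", "decide", "notify"],
              ["register", "decide", "notify"],
              ["register", "check", "check", "decide", "notify"]]
model_traces = [["register", "check", "decide", "notify"]]

# Train activity embeddings on log and model behaviour together.
model = Word2Vec(log_traces + model_traces,
                 vector_size=16, window=2, min_count=1, sg=1, epochs=200)

# Word Mover's Distance between a log trace and a model trace: lower means
# the traces lie closer in embedding space (a fitness-like signal).
for trace in log_traces:
    d = model.wv.wmdistance(trace, model_traces[0])
    print(trace, "->", round(d, 3))
```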

    Discovery of frequent episodes in event logs

    The lion's share of process mining research focuses on the discovery of end-to-end process models describing the characteristic behavior of observed cases. The notion of a process instance (i.e., the case) plays an important role in process mining. Pattern mining techniques (such as frequent itemset mining, association rule learning, sequence mining, and traditional episode mining) do not consider process instances. An episode is a collection of partially ordered events. In this paper, we present a new technique (and corresponding implementation) that discovers frequently occurring episodes in event logs, thereby exploiting the fact that events are associated with cases. Hence, the work can be positioned in between process mining and pattern mining. Episode discovery has applications in, amongst others, discovering local patterns in complex processes and conformance checking based on partial orders. We also discover episode rules to predict behavior and discover correlated behaviors in processes. We have developed a ProM plug-in that exploits efficient algorithms for the discovery of frequent episodes and episode rules. Experimental results based on real-life event logs demonstrate the feasibility and usefulness of the approach.
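
    The case-aware counting can be illustrated with the simplest possible episode, "a eventually followed by b within the same case"; full episode discovery handles arbitrary partial orders and episode rules. The toy log and support threshold below are assumptions.

```python
# Sketch: counting the simplest kind of episode, "a eventually followed by b
# within the same case", and keeping the frequent ones. Real episode
# discovery handles arbitrary partial orders; this only shows how case
# boundaries enter the counting.
from collections import Counter

def frequent_eventually_follows(log, min_support=2):
    support = Counter()
    for trace in log:
        seen_pairs = set()
        seen_before = set()
        for act in trace:
            for a in seen_before:
                seen_pairs.add((a, act))
            seen_before.add(act)
        support.update(seen_pairs)   # count each episode once per case
    return {pair: n for pair, n in support.items() if n >= min_support}

if __name__ == "__main__":
    log = [["a", "b", "c"], ["a", "c"], ["b", "a", "c"]]
    print(frequent_eventually_follows(log))   # {('a','c'): 3, ('b','c'): 2}
```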

    Verification of Logs - Revealing Faulty Processes of a Medical Laboratory

    If there is a suspicion of Lyme disease, a blood sample of a patient is sent to a medical laboratory. The laboratory performs a number of different blood examinations testing for antibodies against the Lyme disease bacteria. The total number of different examinations depends on the intermediate results of the blood count. The cost of each examination is paid by the health insurance company of the patient. To control and restrict the number of performed examinations, the health insurance companies provide a charges regulation document. If a health insurance company disagrees with the charges of a laboratory, it is the job of the public prosecution service to validate the charges according to the regulation document. In this paper we present a case study showing a systematic approach to reveal faulty processes of a medical laboratory. First, files produced by the information system of the respective laboratory are analysed and consolidated in a database. An excerpt from this database is translated into an event log describing a sequential language of events performed by the information system. With the help of the regulation document, this language can be split into two sets: the set of valid and the set of faulty words. In a next step, we build a coloured Petri net model corresponding to the set of valid words, in the sense that only the valid words are executable in the Petri net model. In a last step, we translate the coloured Petri net into a PL/SQL program. This program can automatically reveal all faulty processes stored in the database.
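
    The final classification step can be sketched in a few lines: given a rule derived from the charges regulation, every logged case is split into valid or faulty. The rule below (a confirmatory immunoblot may only be billed after a positive ELISA screening) is a made-up stand-in; in the case study, the valid language is encoded as a coloured Petri net and checked by a generated PL/SQL program.

```python
# Sketch: splitting logged examination sequences into valid and faulty ones.
# The rule encoded here is a made-up stand-in for the charges regulation.
def is_valid(trace):
    elisa_positive = False
    for event in trace:
        if event == "elisa_positive":
            elisa_positive = True
        elif event == "immunoblot" and not elisa_positive:
            return False   # confirmatory test billed without positive screening
    return True

def split_cases(cases):
    """cases: dict case_id -> list of events. Returns (valid_ids, faulty_ids)."""
    valid = [cid for cid, t in cases.items() if is_valid(t)]
    faulty = [cid for cid, t in cases.items() if not is_valid(t)]
    return valid, faulty

if __name__ == "__main__":
    cases = {1: ["elisa_positive", "immunoblot"],
             2: ["elisa_negative", "immunoblot"]}
    print(split_cases(cases))   # ([1], [2])
```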

    Efficient Process Model Discovery Using Maximal Pattern Mining

    In recent years, process mining has become one of the most important and promising areas of research in the field of business process management as it helps businesses understand, analyze, and improve their business processes. In particular, several techniques and algorithms have been proposed to discover and construct process models from workflow execution logs (i.e., event logs). With the existing techniques, mined models are built by analyzing the relationship between any two events seen in the event logs. Being restricted by that, they can only handle special cases of routing constructs and often produce unsound models that do not cover all of the traces seen in the log. In this paper, we propose a novel technique for process discovery using Maximal Pattern Mining (MPM), where we construct patterns based on the whole sequence of events seen in the traces, ensuring the soundness of the mined models. Our MPM technique can handle loops (of any length), duplicate tasks, non-free-choice constructs, and long-distance dependencies. Our evaluation shows that it consistently achieves better precision, replay fitness, and efficiency than the existing techniques.
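
    One way to read "patterns over the whole sequence of events" is sketched below: keep the frequent trace variants and drop any variant that occurs as a contiguous subsequence of another kept variant, so that only maximal patterns remain. This is an illustrative simplification, not the MPM algorithm itself.

```python
# Sketch: keep frequent trace variants and drop those contained in another
# kept variant, leaving only maximal patterns. Illustrative simplification.
from collections import Counter

def is_infix(short, long_):
    """True if `short` occurs as a contiguous subsequence of `long_`."""
    return any(long_[i:i + len(short)] == short
               for i in range(len(long_) - len(short) + 1))

def maximal_frequent_variants(log, min_support=2):
    variants = Counter(tuple(t) for t in log)
    frequent = [v for v, n in variants.items() if n >= min_support]
    maximal = [v for v in frequent
               if not any(v != w and is_infix(v, w) for w in frequent)]
    return maximal

if __name__ == "__main__":
    log = [("a", "b", "c"), ("a", "b", "c"), ("b", "c"), ("b", "c"), ("a", "d")]
    print(maximal_frequent_variants(log))   # [('a', 'b', 'c')]
```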

    Compositional design and verification of component-based information systems

    Information systems have to support more and more complex organizations and the cooperation between organizations. The functionality of these systems is divided into components: each component has its own dedicated set of functionality. Whereas in past years design and verification mostly focused on the internal aspects of a component, like the data aspect and the behavioral aspect, the focus nowadays shifts more and more to the design and verification of the interaction between systems. Different organizations provide systems that need to communicate. Specifically, an organization may allow its components to be used by systems of other organizations. This way, an inter-organizational network of communicating components is formed. One of the main aspects of such a network is that organizations do not want to share with whom they are communicating. This way, the individual systems form a possibly unknown, large-scale ecosystem: a dynamic network of communicating components. These systems communicate via messages: a component requests a service from another component, which in turn eventually sends its answer. Hence, communication between the components is asynchronous by nature. Verification of asynchronously communicating systems is known to be a hard problem.

    In this thesis, we develop a framework to design large-scale component-based information systems in which components communicate asynchronously. The framework allows for verification of local conditions for termination of the complete system. The formal foundation of the framework is Petri nets, in which communication is asynchronous by nature. Classical Petri nets can be used both for modeling the internal activities of a component and for the interaction between components. We focus on soundness of systems: a system should always have a possibility to terminate. We propose sufficient criteria for compositional verification of soundness: if each component in the system is sound, and each pair of asynchronously communicating components satisfies some condition, the whole system is sound. The framework provides methods to design components that are sound by construction. The method uses soundness-preserving refinements of Petri net places in different components by pairs of sound subcomponents.

    Data can be used to enrich the behavioral aspect, the control flow, of an information system, and data is used to store and present information to the users of the system. Classical Petri nets only focus on the ordering of activities. To integrate the data aspect and the behavioral aspect of components, we define a subclass of coloured Petri nets that is, on the one hand, expressive enough to model the flow and correlation of objects and messages and, on the other hand, still amenable to verification.

    All techniques are combined in a design approach for the development of component-based information systems. The approach uses the framework to develop a formal specification from user requirements. The developed specification is directly usable as a prototype, as it has execution semantics. The tool "Yasper" has been developed to support the approach. Process mining techniques can be used to support the design process of component-based information systems by extracting internal aspects of a component, like data, resources, and control flow. In the thesis, we present a process discovery algorithm based on integer linear programming, which can be used for this purpose, as it can handle negative instances that describe undesired behavior.
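
    To make the soundness requirement, that a system should always have a possibility to terminate, concrete, the sketch below brute-forces it for a tiny workflow net by exploring its reachable markings. The net encoding is an assumption, and this kind of explicit-state search over the composed state space is exactly what the compositional criteria in the thesis aim to avoid.

```python
# Sketch: a brute-force soundness check for a small workflow net by exploring
# its reachable markings. Only feasible for tiny, bounded examples.
from collections import deque

def fire(marking, consume, produce):
    """Return the marking after firing, or None if the transition is disabled."""
    if any(marking.get(p, 0) < n for p, n in consume.items()):
        return None
    m = dict(marking)
    for p, n in consume.items():
        m[p] -= n
    for p, n in produce.items():
        m[p] = m.get(p, 0) + n
    return {p: n for p, n in m.items() if n > 0}

def reachable(initial, transitions, limit=10_000):
    """Breadth-first exploration of markings (capped for safety)."""
    key = lambda m: tuple(sorted(m.items()))
    seen = {key(initial): initial}
    queue = deque([initial])
    while queue and len(seen) < limit:
        m = queue.popleft()
        for consume, produce in transitions.values():
            m2 = fire(m, consume, produce)
            if m2 is not None and key(m2) not in seen:
                seen[key(m2)] = m2
                queue.append(m2)
    return list(seen.values())

def is_sound(transitions, initial, final):
    """Option to complete, proper completion, and no dead transitions."""
    fired = set()
    for m in reachable(initial, transitions):
        if final not in reachable(m, transitions):
            return False            # no option to complete from m
        if all(m.get(p, 0) >= n for p, n in final.items()) and m != final:
            return False            # tokens left behind at completion
        for name, (consume, produce) in transitions.items():
            if fire(m, consume, produce) is not None:
                fired.add(name)
    return fired == set(transitions)  # every transition can fire somewhere

if __name__ == "__main__":
    # A two-step workflow net: i -> t1 -> p -> t2 -> o
    net = {"t1": ({"i": 1}, {"p": 1}), "t2": ({"p": 1}, {"o": 1})}
    print(is_sound(net, initial={"i": 1}, final={"o": 1}))   # True
```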